For the usual straight-line model, in which the independent variable takes on a fixed, known set
of values, it is shown that the sample correlation coefficient is distributed as Q with $(n - 2)$ degrees
of freedom and noncentrality $\theta = (\beta/\sigma)\sqrt{\sum (x_i - \bar{x})^2}$. The Q variate has been defined and studied
elsewhere by Hogben et al. It is noted that the square of the correlation coefficient is distributed as a
noncentral beta variable.
1. Introduction

Consider the straight-line model 

$Y_i = \alpha + \beta x_i + \epsilon_i, \quad i = 1, 2, \ldots, n$    (1)

where 

(i) the $\epsilon_i$ are assumed to behave as normally and independently distributed random variables
with mean zero and common variance $\sigma^2$,

(ii) $\alpha$ and $\beta$ are unknown parameters, and

(iii) lowercase italic letters denote fixed, known constants and uppercase italic letters denote 
random variables, i.e., x is fixed and Y is random. This and other straight-line models are dis- 
cussed in detail by Acton [1959]. 

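The sample correlation coefficient $r_{xY}$ is defined by

$r_{xY} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$,    (2)

where $\bar{x} = \sum_{i=1}^{n} x_i / n$ and $\bar{Y} = \sum_{i=1}^{n} Y_i / n$.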
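As a concrete illustration of definition (2), the following minimal sketch computes the sample correlation coefficient for a fixed design x and observed responses y. The function name `sample_correlation` and the data values are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import math


def sample_correlation(x, y):
    """Sample correlation coefficient r_xY as defined in eq. (2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical calibration-style data: x is fixed by design, y is observed.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = sample_correlation(x, y)
```

For responses that lie exactly on a straight line with positive slope, the function returns 1 up to rounding error.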



It is of some interest in the calibration problem, where a fitted straight line is used in reverse to
estimate an unknown $x_0$ corresponding to an observed $Y_0$. The distribution of $r_{XY}$ where X and Y
follow the bivariate normal is well known; see for example Kendall and Stuart [1961, pp. 383-390].
In the present paper the distribution of $r_{xY}$ as defined by (2) is derived for x fixed. This distribution
is well known for the special case with all $Y_i$ identically distributed (i.e., $\beta = 0$), in which case it
is the same as the distribution of $r_{XY}$ for X, Y independent and normal. See, e.g., Hotelling [1953,
p. 196]. J. N. K. Rao and an unidentified person have pointed out that the distribution of $r_{xY}$ can
be obtained as a special case of the conditional distribution of the multiple correlation coefficient
for the multivariate normal; see, e.g., C. R. Rao [1965, p. 509].

2. Derivation 

In an analysis of variance for the model (1) the (corrected) total sum of squares with $(n - 1)$
degrees of freedom may be partitioned into two independent components; the first being the sum
of squares due to the slope with 1 degree of freedom and the second being the residual sum of
squares with $(n - 2)$ degrees of freedom. This partition can be expressed by

$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \dfrac{\left[\sum_{i=1}^{n} (x_i - \bar{x})(Y_i - \bar{Y})\right]^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$,    (3)

where $\hat{Y}_i = \bar{Y} + \hat{\beta}(x_i - \bar{x})$

and $\hat{\beta} = \sum_{i=1}^{n} (x_i - \bar{x})(Y_i - \bar{Y}) \Big/ \sum_{i=1}^{n} (x_i - \bar{x})^2$

is the usual least squares estimator for $\beta$. Let the random variables W and $X^2$ be defined by
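The partition of the corrected total sum of squares into a slope component and a residual component can be checked numerically. The sketch below simulates one data set from the straight-line model and compares the two sides of the partition; the parameter values, design points, and variable names are assumptions made for illustration only.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Assumed illustrative settings; none of these values come from the paper.
alpha, beta, sigma = 1.0, 0.5, 2.0
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [alpha + beta * xi + random.gauss(0.0, sigma) for xi in x]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares slope and fitted values.
beta_hat = s_xy / s_xx
y_hat = [y_bar + beta_hat * (xi - x_bar) for xi in x]

total_ss = sum((yi - y_bar) ** 2 for yi in y)                   # n - 1 d.f.
slope_ss = s_xy ** 2 / s_xx                                     # 1 d.f.
residual_ss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # n - 2 d.f.
```

Here `total_ss` and `slope_ss + residual_ss` agree up to floating-point rounding, whatever seed is used.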

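$W = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(Y_i - \bar{Y})}{\sigma \sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$    (4)

and $X^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 / \sigma^2$.    (5)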

Using (4) and (5) and dividing both sides of eq (3) by $\sigma^2$ we have

$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 / \sigma^2 = W^2 + X^2$.    (6)

If both the numerator and denominator of $r_{xY}$ are divided by $\sigma$ and the first factor of the
denominator is combined with the numerator, then by (4) and (6) the correlation coefficient may be written as

$r_{xY} = \dfrac{W}{\sqrt{W^2 + X^2}}$.    (7)
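The representation of the correlation coefficient as W divided by the square root of W squared plus X squared is an exact algebraic identity, so it can be confirmed on any simulated data set. The following sketch does so under assumed parameter values; the helper names `w_and_x2` and `sample_correlation` are hypothetical.

```python
import math
import random


def w_and_x2(x, y, sigma):
    """Return (W, X^2) as in eqs. (4) and (5), for a known sigma."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = s_xy / s_xx
    residual_ss = sum(
        (yi - (y_bar + beta_hat * (xi - x_bar))) ** 2 for xi, yi in zip(x, y)
    )
    return s_xy / (sigma * math.sqrt(s_xx)), residual_ss / sigma ** 2


def sample_correlation(x, y):
    """Sample correlation coefficient of eq. (2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


random.seed(1)
alpha, beta, sigma = 0.0, 1.5, 3.0  # assumed illustrative values
x = [float(i) for i in range(8)]
y = [alpha + beta * xi + random.gauss(0.0, sigma) for xi in x]

w, x2 = w_and_x2(x, y, sigma)
r_direct = sample_correlation(x, y)      # eq. (2)
r_from_7 = w / math.sqrt(w ** 2 + x2)    # eq. (7)
```

Note that sigma cancels in the ratio, which is why the correlation coefficient itself can be computed without knowing it.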

Since the $Y_i$ are normally distributed, it is easily shown that W is normally distributed with mean
$\theta = (\beta/\sigma)\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}$ and variance 1. Further, it is well known from the theory of the general
linear hypothesis that under model (1) W and $X^2$ are independently distributed and $X^2$ is distributed
as chi-squared with $(n - 2)$ degrees of freedom. Therefore, $r_{xY}$ is equal to the random variable Q
defined and studied in Hogben et al. [1964a, 1964b]. Hence, the following theorem is proved.
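The mean and variance of W can be worked out in a few lines. The sketch below uses the shorthand $S_{xx} = \sum (x_i - \bar{x})^2$, which is introduced here only for compactness and is not notation from the paper.

```latex
% Because \sum (x_i - \bar{x}) = 0, the term in \bar{Y} drops out:
\sum_{i=1}^{n} (x_i - \bar{x})(Y_i - \bar{Y}) = \sum_{i=1}^{n} (x_i - \bar{x}) Y_i .
% Under model (1), E(Y_i) = \alpha + \beta x_i and Var(Y_i) = \sigma^2, independently, so
E\Big[\sum_{i=1}^{n} (x_i - \bar{x}) Y_i\Big]
  = \beta \sum_{i=1}^{n} (x_i - \bar{x}) x_i = \beta S_{xx},
\qquad
\mathrm{Var}\Big[\sum_{i=1}^{n} (x_i - \bar{x}) Y_i\Big] = \sigma^2 S_{xx} .
% Dividing by \sigma \sqrt{S_{xx}} gives
E(W) = \frac{\beta S_{xx}}{\sigma \sqrt{S_{xx}}} = \frac{\beta}{\sigma}\sqrt{S_{xx}} = \theta,
\qquad
\mathrm{Var}(W) = \frac{\sigma^2 S_{xx}}{\sigma^2 S_{xx}} = 1 .
```

Normality of W follows because it is a linear combination of independent normal variables.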

THEOREM: The correlation coefficient $r_{xY}$, defined by (2) under model (1), is distributed as Q with
$(n - 2)$ degrees of freedom and noncentrality $\theta = (\beta/\sigma)\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}$.
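A rough Monte Carlo check of the theorem is sketched below. It draws the correlation coefficient from the straight-line model and, separately, draws Q directly as W divided by the square root of W squared plus X squared, with W normal with mean theta and unit variance and X squared an independent chi-squared variable with n - 2 degrees of freedom, which is the construction implied by the derivation. The parameter values, design points, and number of replications are assumptions chosen for illustration, and agreement of the two sample means is only a plausibility check, not a proof.

```python
import math
import random

random.seed(2)  # fixed seed so the sketch is reproducible

# Assumed illustrative settings; none of these values come from the paper.
alpha, beta, sigma = 0.0, 0.3, 2.0
x = [float(i) for i in range(10)]
n = len(x)
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
theta = (beta / sigma) * math.sqrt(s_xx)  # noncentrality of the theorem


def simulate_r():
    """One draw of r_xY from the straight-line model with x fixed."""
    y = [alpha + beta * xi + random.gauss(0.0, sigma) for xi in x]
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def simulate_q():
    """One draw of Q = W / sqrt(W^2 + X^2) with n - 2 degrees of freedom."""
    w = random.gauss(theta, 1.0)
    # Chi-squared with n - 2 d.f. as a sum of squared standard normals.
    x2 = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n - 2))
    return w / math.sqrt(w ** 2 + x2)


reps = 20000
mean_r = sum(simulate_r() for _ in range(reps)) / reps
mean_q = sum(simulate_q() for _ in range(reps)) / reps
```

With this many replications the two sample means should differ only by simulation noise; comparing higher moments or empirical quantiles in the same way would give a sharper check.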




Various properties of Q are given in the previous two references, including analytic expressions
and recurrence relations for the moments about zero, numerical values for the first four
central moments and an approximation to the distribution of Q by that of a linearly transformed
beta variable. It follows from (7) that $r_{xY}^2$ is distributed as noncentral beta; see for example Seber
[1963], where in his notation $n_1 = 1$, $n_2 = n - 2$ and $\lambda = \theta^2/2$. Furthermore, $t = \sqrt{n - 2}\, r_{xY} / \sqrt{1 - r_{xY}^2}$
is distributed as noncentral t with noncentrality $\theta$ and $(n - 2)$ degrees of freedom. The distribution
of $r_{xY}$ also follows from the interesting and easily derived relation



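$r_{xY} = \dfrac{\hat{\beta}}{\sqrt{\hat{\beta}^2 + (n - 2)\, s_{\hat{\beta}}^2}}$,

where $s_{\hat{\beta}}^2 = \sum (Y_i - \hat{Y}_i)^2 \big/ \left[(n - 2) \sum (x_i - \bar{x})^2\right]$ is the usual estimate of the variance of $\hat{\beta}$.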
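The relation between the correlation coefficient, the least squares slope, and the t statistic can be verified numerically. In the sketch below, `s2_beta` is assumed to denote the usual estimated variance of the slope (the residual mean square divided by the sum of squared deviations of x); this reading of the notation, the parameter values, and the design points are assumptions made for illustration.

```python
import math
import random

random.seed(3)  # fixed seed so the sketch is reproducible

# Assumed illustrative settings; none of these values come from the paper.
alpha, beta, sigma = 2.0, -0.8, 1.0
x = [0.5, 1.0, 1.5, 2.5, 4.0, 6.0, 7.5]
y = [alpha + beta * xi + random.gauss(0.0, sigma) for xi in x]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = s_xy / math.sqrt(s_xx * s_yy)        # sample correlation coefficient
beta_hat = s_xy / s_xx                   # least squares slope
residual_ss = s_yy - s_xy ** 2 / s_xx    # residual sum of squares via the ANOVA partition
s2_beta = residual_ss / ((n - 2) * s_xx) # assumed: estimated variance of beta_hat

# Correlation coefficient recovered from the slope and its estimated variance.
r_from_relation = beta_hat / math.sqrt(beta_hat ** 2 + (n - 2) * s2_beta)

# The t statistic written in terms of r, and as slope over its standard error.
t_from_r = math.sqrt(n - 2) * r / math.sqrt(1.0 - r ** 2)
t_from_slope = beta_hat / math.sqrt(s2_beta)
```

Both pairs agree up to rounding error, which shows that the statistic t built from the correlation coefficient is the familiar t statistic for testing the slope.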



Thanks go to Joan Rosenblatt for pointing out the looseness of an earlier proof of the theorem 
and to her and Edwin L. Crow for constructive suggestions. 